Declarative Parallel Programming for GPUs

Authors

  • Eric Holk
  • William E. Byrd
  • Nilesh Mahajan
  • Jeremiah Willcock
  • Arun Chauhan
  • Andrew Lumsdaine
Abstract

The recent rise in the popularity of Graphics Processing Units (GPUs) has been fueled by software frameworks, such as NVIDIA’s Compute Unified Device Architecture (CUDA) and Khronos Group’s OpenCL, that make GPUs available for general-purpose computing. However, CUDA and OpenCL are still low-level approaches that require users to handle details of data layout and movement across levels of the memory hierarchy. We propose a declarative approach to coordinating computation and data movement between CPU and GPU, through a domain-specific language called Harlan. A declarative language not only obviates the need for the programmer to write low-level, error-prone boilerplate code; by raising the level of abstraction at which GPU computation is specified, it also allows the compiler to optimize data movement and the overlap between CPU and GPU computation. By focusing on the “what”, and not the “how”, of data layout, data movement, and computation scheduling, the language eliminates the sources of many programming errors related to correctness and performance.

Similar resources

Accelerating high-order WENO schemes using two heterogeneous GPUs

A double-GPU code is developed to accelerate WENO schemes. The test problem is a compressible viscous flow. The convective terms are discretized using third- to ninth-order WENO schemes and the viscous terms are discretized by the standard fourth-order central scheme. The code written in CUDA programming language is developed by modifying a single-GPU code. The OpenMP library is used for parall...

Declarative Programming Techniques for Many-Core Architectures

Future manycore architectures are likely to have heterogeneous computing resources which will include conventional CPUs as well as variants of today’s GPUs and reconfigurable logic like FPGAs. Many of the techniques that the reconfigurable computing community has championed will find new applications in mainstream applications. One challenge posed by such manycore architectures is the requireme...

CnC-CUDA: Declarative Programming for GPUs

The computer industry is at a major inflection point in its hardware roadmap due to the end of a decades-long trend of exponentially increasing clock frequencies. Instead, future computer systems are expected to be built using homogeneous and heterogeneous many-core processors with 10’s to 100’s of cores per chip, and complex hardware designs to address the challenges of concurrency, energy eff...

A GPU Implementation of Large Neighborhood Search for Solving Constraint Optimization Problems

Constraint programming has gained prominence as an effective and declarative paradigm for modeling and solving complex combinatorial problems. In particular, techniques based on local search have proved practical to solve real-world problems, providing a good compromise between optimality and efficiency. In spite of the natural presence of concurrency, there has been relatively limited effort t...

Heterogeneous Programming with Single Operation Multiple Data

Heterogeneity is omnipresent in today’s commodity computational systems, which comprise at least one multi-core Central Processing Unit (CPU) and one Graphics Processing Unit (GPU). Nonetheless, all this computing power is not being harnessed in mainstream computing, as the programming of these systems entails many details of the underlying architecture and of its distinct execution models. Cur...

Publication date: 2011